Abstract



Variable length communication over a compound channel with feedback is considered.
Traditionally, capacity of a compound channel without feedback is defined as the maximum rate that
is determined before the start of communication such that communication is reliable. This
traditional definition is pessimistic. In the presence of feedback, an opportunistic definition is given.
Capacity is defined as the maximum rate that is determined at the end of communication such
that communication is reliable. Thus, the transmission rate can adapt to the realized channel.
Under this definition, feedback communication over a compound channel is conceptually similar
to multi-terminal communication. Transmission rate is a vector rather than a scalar; channel
capacity is a region rather than a scalar; error exponent is a region rather than a scalar. In this
paper, variable length communication over a compound channel with feedback is formulated, its
opportunistic capacity region is characterized, and lower bounds for its error exponent region
are provided.

1 Introduction 

The compound channel, first considered by Wolfowitz [1] and Blackwell et al. [2], is one of the
simplest extensions of the DMC (discrete memoryless channel). In a compound channel, the channel
transition matrix $Q_\circ$ belongs to a family $\mathcal{Q}$ that is defined over common discrete input and
output alphabets $\mathcal{X}$ and $\mathcal{Y}$. The transmitter and the receiver know the compound family $\mathcal{Q}$ but
do not know the realized channel $Q_\circ$; the realized channel $Q_\circ$ does not change with time. We are
interested in characterizing the error exponents of a compound channel used with feedback. For
that purpose, we define a new notion of the capacity of the compound channel with feedback.

There have been comprehensive investigations on the capacity of compound channels, used both 
with and without feedback. In addition, there is some work on characterizing the error exponent 
of compound channels used with feedback. We briefly summarize the existing work below, focusing
on finite compound families $\mathcal{Q} = \{Q_1, \dots, Q_L\}$.

Given a coding scheme defined over a compound family $\mathcal{Q}$, let $P_\ell$ and $R_\ell$ denote the
probability of error and transmission rate when the realized channel $Q_\circ$ is $Q_\ell$, $\ell = 1, \dots, L$. The
general notion of capacity of a compound channel is as follows: a rate $R$ is said to be achievable if,
for every $\varepsilon > 0$, there exists a sequence of coding schemes such that $P_\ell^{(n)} < \varepsilon$ and $R_\ell^{(n)} > R - \varepsilon$, $\ell = 1, \dots, L$.
Then, the capacity is the supremum of all achievable rates. This same notion applies when the
channel is used without or with feedback (the difference being in the choice of coding schemes).

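When the compound channel is used without feedback, the capacity is given by (see [3])

$$\mathcal{C}(\mathcal{Q}) = \max_{P \in \Delta(\mathcal{X})} \inf_{Q \in \mathcal{Q}} I(P, Q) \tag{1}$$

where $\Delta(\mathcal{X})$ is the space of probability distributions on the input alphabet $\mathcal{X}$ and

$$I(P, Q) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} P(x) Q(y|x) \log \frac{Q(y|x)}{\sum_{x' \in \mathcal{X}} P(x') Q(y|x')}$$

is the mutual information between the input and output of a channel with input distribution $P$
and channel transition matrix $Q$. When the compound channel is used with feedback, the capacity
is given by (see [4])

$$\mathcal{C}_F(\mathcal{Q}) = \inf_{Q \in \mathcal{Q}} \max_{P \in \Delta(\mathcal{X})} I(P, Q) \tag{2}$$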

These and other variations of the compound channel are surveyed in [5] . 

The above notion of capacity is pessimistic. It quantifies the maximum rate determined before
the start of transmission such that communication is reliable over every realized channel $Q_\circ$. An
opportunistic definition of capacity is possible in the presence of feedback.

For many applications, network traffic is backlogged and a rate guarantee before the start of 
transmission is not critical. Rather, we want to communicate at the maximum rate while ensuring 
that communication is reliable for the realized channel $Q_\circ$ (even though $Q_\circ$ is not known to the
transmitter or the receiver before the start of transmission). In particular, instead of modeling
achievable rate as a scalar value $R$ that is guaranteed before the start of communication, we model
achievable rate as a vector $(R_1, \dots, R_L)$ such that the rate of communication is $R_\ell$ when the realized
channel is $Q_\ell$. In addition, communication is reliable for every realized channel. More precisely,
we say that a rate vector $(R_1, \dots, R_L)$ is opportunistically achievable if, for every $\varepsilon > 0$, there exists a sequence
of coding schemes such that $P_\ell^{(n)} < \varepsilon$ and $R_\ell^{(n)} > R_\ell - \varepsilon$, $\ell = 1, \dots, L$. We define the union of all
opportunistically achievable rates as the opportunistic capacity region $\mathcal{C}_{OF}(\mathcal{Q})$,

$$\mathcal{C}_{OF}(\mathcal{Q}) = \{(R_1, \dots, R_L) : (R_1, \dots, R_L) \text{ is opportunistically achievable}\}. \tag{3}$$

We formally define opportunistically achievable rates and opportunistic capacity in Section 2.

Let $C_\ell$ denote the capacity of DMC $Q_\ell$, $\ell = 1, \dots, L$. Then, it is straightforward to show (see
Corollary 1) that the opportunistic capacity region is given by a hyper-rectangle

$$\mathcal{C}_{OF}(\mathcal{Q}) = \{(R_1, \dots, R_L) : 0 < R_\ell < C_\ell,\ \ell = 1, \dots, L\},$$

which is determined by just its upper corner $(C_1, \dots, C_L)$. Thus, the capacity region $\mathcal{C}_{OF}(\mathcal{Q})$ is
equivalent to the capacity vector $\mathbf{C}_{\mathcal{Q}} := (C_1, \dots, C_L)$.
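As a small illustration of the hyper-rectangle description, the sketch below computes the capacity vector of a hypothetical compound family of binary symmetric channels and tests whether a rate vector lies in the region. The BSC family, the function names, and the use of strict inequalities in the membership test are assumptions made for illustration; they are not taken from the paper.

```python
import math

def binary_entropy(p: float) -> float:
    """Binary entropy in bits, with h(0) = h(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p: float) -> float:
    """Capacity (bits per channel use) of a BSC with crossover probability p."""
    return 1.0 - binary_entropy(p)

def capacity_vector(crossovers):
    """Upper corner (C_1, ..., C_L) of the hyper-rectangle for a BSC family."""
    return [bsc_capacity(p) for p in crossovers]

def in_opportunistic_region(rates, capacities):
    """Membership test; strict inequalities are an illustrative assumption."""
    return all(0 < r < c for r, c in zip(rates, capacities))

def pessimistic_feedback_capacity(capacities):
    """Scalar capacity with feedback for a finite family: the smallest C_l."""
    return min(capacities)

caps = capacity_vector([0.1, 0.2])
print(caps, pessimistic_feedback_capacity(caps))
```

For this family the pessimistic scalar capacity equals the smaller component of the capacity vector, while the opportunistic region lets the rate on the better channel approach its own capacity.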

In this paper, we consider variable length coding schemes. For a sequence $\{S^{(n)}\}$ of coding
schemes that (opportunistically) achieves a rate vector $(R_1, \dots, R_L)$, we define the error exponent
vector $(E_1, \dots, E_L)$ as

$$E_\ell = \lim_{n \to \infty} \frac{-\log P_\ell^{(n)}}{\mathbb{E}_\ell[\tau^{(n)}]},$$

where $\mathbb{E}_\ell[\tau^{(n)}]$ is the expected length of the coding scheme when the realized channel is $Q_\ell$.
The union of all achievable error exponent vectors is defined as the error exponent region (EER) at
rate $(R_1, \dots, R_L)$ and denoted by $\mathcal{E}(R_1, \dots, R_L)$. The formal definition is presented in Section 2.
Consider a DMC $Q$ used with feedback. Let $C_Q$ denote its capacity. The error exponent of
a variable length coding scheme at rate $R < C_Q$ is given by (see [6])

$$E_B(R, Q) = B_Q (1 - R/C_Q), \tag{4}$$

where

$$B_Q = \max_{x_A, x_R \in \mathcal{X}} b_Q(x_A, x_R), \tag{5}$$

$$b_Q(x_A, x_R) = D(Q(\cdot|x_A) \,\|\, Q(\cdot|x_R)), \tag{6}$$

$Q(\cdot|x)$ is the probability distribution of the channel output when the channel input is $x$, and

$$D(p \,\|\, q) = \sum_{y} p(y) \log \frac{p(y)}{q(y)}$$

is the Kullback-Leibler divergence between probability distributions $p$ and $q$. We call $E_B(R, Q)$
the Burnashev exponent of channel $Q$ at rate $R$ and $B_Q$ the zero-rate Burnashev exponent.
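The Burnashev exponent is easy to evaluate numerically for a small channel. The following sketch assumes the channel is given as a row-stochastic matrix `Q[x][y]` (a representation chosen here for illustration) and computes the zero-rate exponent as the largest Kullback-Leibler divergence between two output distributions, in bits, followed by the linear-in-rate exponent.

```python
import math
from itertools import permutations

def kl_divergence(p, q):
    """D(p || q) in bits; infinite if p has mass where q has none."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue
        if qi == 0:
            return math.inf
        total += pi * math.log2(pi / qi)
    return total

def zero_rate_burnashev(Q):
    """B_Q: maximum of D(Q(.|x_A) || Q(.|x_R)) over ordered pairs of distinct inputs."""
    return max(kl_divergence(Q[a], Q[r])
               for a, r in permutations(range(len(Q)), 2))

def burnashev_exponent(rate, capacity, B):
    """E_B(R, Q) = B_Q * (1 - R / C_Q), for 0 <= R <= C_Q."""
    return B * (1 - rate / capacity)

# Example: binary symmetric channel with crossover probability 0.1.
bsc = [[0.9, 0.1], [0.1, 0.9]]
B = zero_rate_burnashev(bsc)
print(B, burnashev_exponent(0.25, 0.5, B))
```

The exponent falls linearly from $B_Q$ at zero rate to zero at capacity, which is the non-zero slope at capacity discussed next.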

One of the key features of the Burnashev exponent is that it has a non-zero slope at capacity. This 
slope captures the main advantage of feedback — by reducing the transmission rate by a small frac- 
tion of the capacity, we linearly increase the error exponent, and therefore, exponentially decrease 
the probability of error. Does feedback provide the same advantage for a compound channel? 

Clearly, a particular component $E_\ell$ of the EER of the compound channel cannot beat the
Burnashev exponent for DMC $Q_\ell$. Thus, a trivial upper bound for the EER at rate $(R_1, \dots, R_L) \in \mathcal{C}_{OF}(\mathcal{Q})$
is the hyper-rectangle with upper corner

$$\left( B_{Q_1} (1 - R_1/C_{Q_1}), \dots, B_{Q_L} (1 - R_L/C_{Q_L}) \right) \tag{7}$$

Tchamkerten and Telatar [7] showed that this bound is not tight by means of a simple
counterexample. They considered a compound family consisting of two binary symmetric channels with
complementary cross-over probabilities, $p$ and $1 - p$, where $p$ is known to the transmitter and the
receiver. They showed that, even for this simple family, no coding scheme universally achieves the
Burnashev exponent.

Another way to interpret that result is that the EER need not be a hyper-rectangle, i.e., for a
fixed rate $\mathbf{R} = (R_1, \dots, R_L)$, if $(E'_1, \dots, E'_L), (E''_1, \dots, E''_L) \in \mathcal{E}(\mathbf{R})$, then it is not necessary that

$$(\max(E'_1, E''_1), \dots, \max(E'_L, E''_L)) \in \mathcal{E}(\mathbf{R}).$$

Thus, different sequences of coding schemes that achieve the same rate vector $(R_1, \dots, R_L)$ may have
different and non-comparable error exponents. Thus, in terms of error exponents, the compound
channel with feedback behaves in a manner similar to multi-terminal communication channels [8].

Tchamkerten and Telatar [7] also identified necessary and sufficient conditions on the compound
family $\mathcal{Q}$ under which the upper bound of (7) is tight for all rates along the principal diagonal
$(\gamma C_{Q_1}, \dots, \gamma C_{Q_L})$, $0 < \gamma < 1$, of the opportunistic capacity region. For channels that do not
satisfy these conditions, the EER is not characterized. Even when these conditions are satisfied,
the EER is not characterized for rate vectors that are off the principal diagonal (i.e., $R_\ell/C_{Q_\ell}$ is
not constant for all $\ell = 1, \dots, L$). In Section 3 we present a coding scheme for all rates in the
opportunistic capacity region. This scheme achieves an error exponent with a non-zero slope at all
points in the rate region, including points near the capacity boundary. This shows that feedback
provides a similar advantage for a compound channel as for a DMC.

Notation 

We use the following notation in this paper. $\Delta(\mathcal{X})$ denotes the space of probability distributions
over $\mathcal{X}$. $\mathbb{N}$ denotes the set of natural numbers. $\mathbb{P}(\cdot)$ denotes the probability of an event, $\mathbb{E}[\cdot]$ denotes
the expectation of a random variable, and $\mathbb{1}\{\cdot\}$ denotes the indicator function. All logarithms are
to the base 2, and $\exp_2(\cdot)$ denotes $2^{(\cdot)}$.

$C_Q$ denotes the capacity of a DMC with transition matrix $Q$; $B_Q$ denotes its zero-rate
Burnashev exponent. Given a compound family $\mathcal{Q} = \{Q_1, \dots, Q_L\}$, $Q_\circ$ denotes the realized channel;
$C_\ell$ denotes the capacity $C_{Q_\ell}$ of DMC $Q_\ell$; $B_\ell$ denotes the zero-rate Burnashev exponent of DMC
$Q_\ell$. $\mathbb{P}_\ell(\cdot)$ is shorthand for $\mathbb{P}(\cdot \mid Q_\circ = Q_\ell)$, and $\mathbb{E}_\ell[\cdot]$ is shorthand for $\mathbb{E}[\cdot \mid Q_\circ = Q_\ell]$.

2 Opportunistic capacity and error exponents 

In this section we formally define opportunistic capacity and error exponent regions for a compound 
channel with feedback. Conceptually, it is easier to first define achievable rate vector for fixed length 
communication and then extend that definition to variable length communication. However, for 
succinctness, we only define achievable rate vector for variable length communication. 

Definition 1 (Variable-rate variable-length coding scheme) A variable-rate variable-length
coding scheme for communicating over a compound channel $\mathcal{Q} = \{Q_1, \dots, Q_L\}$ with feedback is a
tuple $(\mathbf{M}, \mathbf{f}, \mathbf{g}, \tau)$ where

• $\mathbf{M} = (M_1, \dots, M_L)$ is the compound message size where $M_\ell \in \mathbb{N}$, $\ell = 1, \dots, L$. Define

$$\mathcal{M} = \prod_{\ell=1}^{L} \{1, \dots, M_\ell\}.$$

• $\mathbf{f} = (f_1, f_2, \dots)$ is the encoding strategy where

$$f_t : \mathcal{M} \times \mathcal{Y}^{t-1} \to \mathcal{X}, \quad t \in \mathbb{N}$$

is the encoding function used at time $t$.

• $\mathbf{g} = (g_1, g_2, \dots)$ is the decoding strategy where

$$g_t : \mathcal{Y}^{t} \to \bigcup_{\ell=1}^{L} \{(\ell, 1), (\ell, 2), \dots, (\ell, M_\ell)\}, \quad t \in \mathbb{N}$$

is the decoding function at time $t$.

• $\tau$ is the stopping time with respect to the channel outputs $Y^t$. More precisely, $\tau$ is a stopping
time with respect to the filtration $\{2^{\mathcal{Y}^t}, t \in \mathbb{N}\}$. □

The coding scheme is known to both the transmitter and the receiver. Variable length
communication takes place as follows. A compound message $\mathbf{W} = (W_1, \dots, W_L)$ is generated such that $W_\ell$
is uniformly distributed in $\{1, \dots, M_\ell\}$.¹ The transmitter uses the encoding strategy $(f_1, f_2, \dots)$
to generate channel inputs

$$X_1 = f_1(\mathbf{W}), \quad X_2 = f_2(\mathbf{W}, Y_1), \quad \dots$$

until the stopping time $\tau$ with respect to the channel outputs. ($\tau$ is known to the transmitter
because of feedback.) The decoder then generates a decoding decision

$$(\hat{L}, \hat{W}) = g_\tau(Y_1, \dots, Y_\tau).$$

The decoding decision consists of two components: the index $\hat{L}$ of the decoded component and an
estimate $\hat{W}$ of the $\hat{L}$-component of the compound message $\mathbf{W}$. A communication error occurs if

$$\hat{W} \neq W_{\hat{L}}.$$

1 All the probabilities of interest only depend on the marginal distributions of $W_1, \dots, W_L$. So, the joint distribution
of $(W_1, \dots, W_L)$ need not be specified.
Remark 1 The above scheme is a variable-rate variable-length coding scheme. The transmitter
and receiver agree upon the set of rates $\{R_1, \dots, R_L\}$ before the start of communication. The
transmitter chooses $L$ different messages, one message for each rate. At the end of communication,
the receiver decides the message $\hat{L}$ it wants to decode and generates an estimate $\hat{W}$ for that message.
Because of noiseless feedback, the encoder knows what the decoder decoded. In principle, the index
$\hat{L}$ need not be the same as the index $\ell$ of the realized channel. For that reason, $\{\hat{L} \neq \ell\}$ is not
considered a communication error. □

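The two main performance metrics of a coding scheme are its error probability and rate, both of
which are vectors (rather than scalars), and denoted by $\mathbf{P} = (P_1, \dots, P_L)$ and $\mathbf{R} = (R_1, \dots, R_L)$,
respectively. These are defined as follows.

Definition 2 (Probability of error) A communication error occurs when $\hat{W} \neq W_{\hat{L}}$. The
probability of error $\mathbf{P} = (P_1, \dots, P_L)$ of a coding scheme $(\mathbf{M}, \mathbf{f}, \mathbf{g}, \tau)$ is given by

$$P_\ell = \mathbb{P}_\ell(\hat{W} \neq W_{\hat{L}}), \quad \ell = 1, \dots, L,$$

where $\mathbb{P}_\ell(\cdot)$ is shorthand notation for $\mathbb{P}(\cdot \mid Q_\circ = Q_\ell)$. □

Definition 3 (Rate) The rate $\mathbf{R} = (R_1, \dots, R_L)$ of a coding scheme $(\mathbf{M}, \mathbf{f}, \mathbf{g}, \tau)$ is given by

$$R_\ell = \frac{\mathbb{E}_\ell[\log M_{\hat{L}}]}{\mathbb{E}_\ell[\tau]}, \quad \ell = 1, \dots, L,$$

where $\mathbb{E}_\ell[\cdot]$ is shorthand notation for $\mathbb{E}[\cdot \mid Q_\circ = Q_\ell]$. □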

Remark 2 The above scheme is a variable rate communication scheme. The size $M_{\hat{L}}$ of the
communicated message $W_{\hat{L}}$ is a random variable taking values in $\{M_1, \dots, M_L\}$. For that reason,
we define the rate as $\mathbb{E}_\ell[\log M_{\hat{L}}]/\mathbb{E}_\ell[\tau]$. When all rates $\{R_1, \dots, R_L\}$ are equal, the above scheme
reduces to a fixed-rate variable-length coding scheme and the definition of rate in Definition 3
collapses to the traditional definition of rate for fixed-rate variable-length coding. □

Rate and probability of error give rise to two asymptotic performance metrics, viz., opportunis- 
tically achievable rate and error exponents. These are defined as follows. 

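Definition 4 (Opportunistically achievable rate) A rate vector $\mathbf{R} = (R_1, \dots, R_L)$ is said to
be opportunistically achievable if there exists a sequence of variable-rate variable-length coding
schemes $(\mathbf{M}^{(n)}, \mathbf{f}^{(n)}, \mathbf{g}^{(n)}, \tau^{(n)})$, $n \in \mathbb{N}$, such that:

1. $\lim_{n \to \infty} \mathbb{E}_\ell[\tau^{(n)}] = \infty$ for $\ell = 1, \dots, L$.

2. For every $\varepsilon > 0$, there exists an $n_\circ(\varepsilon)$ so that for every $n > n_\circ(\varepsilon)$, we have

$$P_\ell^{(n)} < \varepsilon \quad \text{and} \quad R_\ell^{(n)} > R_\ell - \varepsilon, \quad \text{for all } \ell = 1, \dots, L;$$

or equivalently,

$$\lim_{n \to \infty} P_\ell^{(n)} = 0 \quad \text{and} \quad \lim_{n \to \infty} R_\ell^{(n)} = R_\ell. \qquad □$$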

Definition 5 (Opportunistic Capacity) The union of all opportunistically achievable rates is
called the opportunistic capacity region of the compound channel $\mathcal{Q}$ with feedback and denoted by
$\mathcal{C}_{OF}(\mathcal{Q})$. □

In Corollary 1, we show that $\mathcal{C}_{OF}(\mathcal{Q})$ is given by a hyper-rectangle with upper corner $(C_{Q_1}, \dots, C_{Q_L})$.
For that reason, we call $\mathbf{C}_{\mathcal{Q}} := (C_{Q_1}, \dots, C_{Q_L})$ the capacity vector of the compound channel $\mathcal{Q}$.

The variable-rate variable-length coding scheme defined above is related to the notion of rateless
codes used in fountain codes [9]-[11] for the BEC (binary erasure channel).



Definition 6 (Error exponent) Given a sequence of coding schemes $(\mathbf{M}^{(n)}, \mathbf{f}^{(n)}, \mathbf{g}^{(n)}, \tau^{(n)})$, $n \in \mathbb{N}$,
that achieve a rate vector $\mathbf{R}$, the asymptotic exponent $E_\ell$ of error probability $P_\ell$ is given by

$$E_\ell = \lim_{n \to \infty} \frac{-\log P_\ell^{(n)}}{\mathbb{E}_\ell[\tau^{(n)}]}.$$

Then $\mathbf{E} = (E_1, \dots, E_L)$ is the error exponent of the sequence of coding schemes $(\mathbf{M}^{(n)}, \mathbf{f}^{(n)}, \mathbf{g}^{(n)}, \tau^{(n)})$,
$n \in \mathbb{N}$. □

Definition 7 (Error exponent region) For a particular rate $\mathbf{R}$, the union of all possible error
exponents is called the error exponent region (EER) of a compound channel with feedback and
denoted by $\mathcal{E}(\mathbf{R})$. □

In this paper, we study the EER for all rates in the opportunistic capacity region and present 
lower bounds on the EER. 

The above scheme describes a variable-rate variable-length coding scheme; varying the rate of 
the coding scheme allows for an additional degree of freedom. This additional freedom does not
affect the opportunistic capacity region of the compound channel; all rates within $\mathcal{C}_{OF}(\mathcal{Q})$ defined
above can be achieved using a fixed-rate variable-length coding scheme. We do not know if this
additional degree of freedom improves the EER since the EER of a compound channel has not been 
investigated using the traditional fixed-rate variable-length coding scheme. The reason that we 
chose a variable-rate coding scheme is that this additional degree of freedom significantly simplifies 
the coding scheme. 



Operational interpretation 

A transmitter has to reliably communicate an infinite bit stream, which is generated by a
higher-layer application, to a receiver over a compound channel with feedback. The transmitter uses a
variable-rate variable-length coding scheme $(\mathbf{M}, \mathbf{f}, \mathbf{g}, \tau)$. For ease of exposition, assume that every
$M_\ell$, $\ell = 1, \dots, L$, is a power of 2 so that $\log M_\ell$ is an integer. Let $M^* = \max\{M_1, \dots, M_L\}$ and
$M_* = \min\{M_1, \dots, M_L\}$. The transmitter picks $\log M^*$ bits from the bit stream. The decimal
expansion of the first $\log M_\ell$ of these bits determines the component $W_\ell$ of $\mathbf{W}$. The message $\mathbf{W}$ is
transmitted as described above. At stopping time $\tau$ the receiver passes $(\hat{L}, \hat{W})$ to a higher-layer
application (which then converts $\hat{W}$ to bits), and the transmitter removes the first $\log M_{\hat{L}}$ bits from
the $\log M^*$ initially chosen bits and returns the remaining $\log M^* - \log M_{\hat{L}}$ bits to the bit stream.
Then, the above process is repeated.

If the traditional pessimistic approach is followed, only $\log M_*$ bits are removed from the bit
stream at each stage. By following the opportunistic approach, with high probability $\log M_\ell$ bits
are removed from the bit stream when the realized channel $Q_\circ$ is $Q_\ell$. By definition, $M_\ell \geq M_*$.
Thus, by defining capacity in an opportunistic manner, an additional $\log M_\ell - \log M_*$ bits are
removed at each step.
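The bookkeeping on the bit stream can be sketched as follows. This is a minimal illustration that assumes error-free decoding and takes the decoded index as an input; the function name `run_stage` and the use of a double-ended queue for the bit stream are choices made here, not part of the paper.

```python
from collections import deque

def run_stage(bit_stream, message_sizes, decoded_index):
    """One stage: take log2(max M) bits, deliver log2(M of decoded index), return the rest.

    message_sizes must all be powers of 2; decoded_index plays the role of the
    index chosen by the receiver at the stopping time.
    """
    log_sizes = [m.bit_length() - 1 for m in message_sizes]  # exact log2 for powers of 2
    taken = [bit_stream.popleft() for _ in range(max(log_sizes))]
    n_deliver = log_sizes[decoded_index]
    delivered, leftover = taken[:n_deliver], taken[n_deliver:]
    # Put the unused bits back at the front of the stream in their original order.
    bit_stream.extendleft(reversed(leftover))
    return delivered

stream = deque([1, 0, 1, 1, 0, 0, 1, 0])
print(run_stage(stream, [4, 16], decoded_index=0), list(stream))
print(run_stage(stream, [4, 16], decoded_index=1), list(stream))
```

When the receiver decodes the component with the larger message size, more bits leave the stream in that stage, which is the gain of the opportunistic approach over always removing $\log M_*$ bits.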
A trivial outer bound on error exponents

Any coding scheme $(\mathbf{M}, \mathbf{f}, \mathbf{g}, \tau)$ for communicating over a compound channel can also be used
to communicate over DMC $Q_\ell$. Hence, we have the following trivial upper bound on the EER.

Proposition 1 For any variable-rate variable-length coding scheme for communicating over $\mathcal{Q}$
at rate $(R_1, \dots, R_L)$, each component of the error exponent region is bounded by the Burnashev
exponent of channel $Q_\ell$, i.e.,

$$E_\ell \leq B_{Q_\ell} (1 - R_\ell / C_{Q_\ell}). \qquad □$$

In the remainder of the paper, we try to derive a reasonable lower bound on the EER.

3 The coding scheme 

In this section, we define a family of variable-rate variable-length coding schemes indexed by $n \in \mathbb{N}$.
As $n \to \infty$, the scheme opportunistically achieves a rate vector $(R_1, \dots, R_L)$. This coding scheme
is based on the Yamamoto-Itoh [12] scheme that achieves the Burnashev exponent for a DMC.



3.1 Parameters of the coding scheme 

For each $n \in \mathbb{N}$, the scheme is parameterized by the following non-negative real constants²

$$\alpha_m^{(n)},\ \alpha_c^{(n)}, \quad \text{and} \quad \beta_{m,\ell}^{(n)},\ \beta_{c,\ell}^{(n)},\ \xi_\ell, \quad \ell = 1, \dots, L.$$

We will explain the purpose and choice of these constants later. For now, we assume that $\alpha_m^{(n)}$,
$\alpha_c^{(n)}$, $\beta_{m,\ell}^{(n)}$ and $\beta_{c,\ell}^{(n)}$ are chosen such that $\alpha_m^{(n)} n$, $\alpha_c^{(n)} n$, $\beta_{m,\ell}^{(n)} n$ and $\beta_{c,\ell}^{(n)} n$ are integers. When there is
no ambiguity, we will not explicitly show the dependence on $n$ and drop the superscripts $(n)$.
For each $n$, the encoder and the decoder agree upon the following:

1. Two training sequences, $\sigma_m$ and $\sigma_c$, of lengths $\alpha_m n$ and $\alpha_c n$, and corresponding channel
estimation rules $\theta_m$ and $\theta_c$.

2. $L$ codebooks, one for each $Q_\ell$, $\ell = 1, \dots, L$. Codebook $\ell$ has rate $\xi_\ell R_\ell / \beta_{m,\ell}$ and length
$\beta_{m,\ell} n$.

3. $2L$ control sequences, two for each $Q_\ell$, $\ell = 1, \dots, L$, viz.³ $\sigma_{A,\ell}$ and $\sigma_{R,\ell}$, both of length
$\beta_{c,\ell} n$, and corresponding hypothesis testing rules $\theta_{H,\ell}$ for disambiguating $\sigma_{A,\ell}$ and $\sigma_{R,\ell}$
over DMC $Q_\ell$.

A compound message $\mathbf{W}^{(n)}$ is chosen at random such that component $W_\ell^{(n)}$, $\ell = 1, \dots, L$, is
uniformly distributed over $\{1, \dots, \exp_2(n \xi_\ell R_\ell)\}$.⁴

3.2 Operation of the coding scheme 

The coding scheme transmits in multiple epochs indexed by $k \in \mathbb{N}$. Each epoch consists of four
phases:



2 The subscripts stand for message and control. 

3 The subscripts stand for accept and reject. 

4 The joint distribution of $(W_1^{(n)}, \dots, W_L^{(n)})$ does not matter.
1. A fixed length training phase of length $\alpha_m n$. During this phase the transmitter sends
the training sequence $\sigma_m$; both the transmitter and the receiver use the estimation rule $\theta_m$
to determine a channel estimate $\hat{L}_m = \hat{L}_m(k)$.

2. A variable length message phase of length $\beta_{m,\hat{L}_m} n$. The transmitter and receiver use
codebook $\hat{L}_m$ to send component $\hat{L}_m$ of the compound message $\mathbf{W}$. Let $W_m(k)$ denote the
transmitted message and $\hat{W}_m(k)$ the decoded message.

3. A fixed length re-training phase of length $\alpha_c n$. During this phase the transmitter sends
the training sequence $\sigma_c$; both the transmitter and the receiver use the estimation rule $\theta_c$ to
determine a channel estimate $\hat{L}_c = \hat{L}_c(k)$.

4. A variable length control phase of length $\beta_{c,\hat{L}_c} n$. If $\hat{W}_m(k) = W_m(k)$, the transmitter
sends a control message $W_c(k) = \sigma_{A,\hat{L}_c}$; otherwise it sends $W_c(k) = \sigma_{R,\hat{L}_c}$. The receiver
decodes the control message using $\theta_{H,\hat{L}_c}$. Let $\hat{W}_c(k)$ denote the estimated control message.

If $\hat{W}_c(k) = \sigma_{A,\hat{L}_c}$, then transmission stops and the receiver declares $(\hat{L}_m(k), \hat{W}_m(k))$ as its final
decision; otherwise, the compound message is retransmitted in the next epoch. Let $K^{(n)}$ denote
the epoch when communication stops, i.e.,

$$K^{(n)} = \inf\{k \in \mathbb{N} : \hat{W}_c(k) = \sigma_{A,\hat{L}_c(k)}\}.$$

Let the length of epoch $k$ be $\Lambda^{(n)}(k) n$, i.e.,

$$\Lambda^{(n)}(k) n = \alpha_m^{(n)} n + \beta_{m,\hat{L}_m(k)}^{(n)} n + \alpha_c^{(n)} n + \beta_{c,\hat{L}_c(k)}^{(n)} n.$$

Hence, the length of communication is

$$\tau^{(n)} = \sum_{k=1}^{K^{(n)}} \Lambda^{(n)}(k) n.$$
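The control flow of the four-phase epoch can be sketched with a small Monte Carlo model. The sketch abstracts every phase into a single error probability (`p_est_err`, `p_msg_err`, `p_false_accept`, `p_false_reject` are hypothetical parameters, and message errors are drawn independently of the channel estimate); it illustrates only the retransmission logic and the accumulation of the communication length, not the actual codes or estimators.

```python
import random

def simulate_transmission(n, alpha_m, alpha_c, beta_m, beta_c, true_idx,
                          p_est_err, p_msg_err, p_false_accept, p_false_reject, rng):
    """Run epochs until the receiver decodes ACCEPT.

    Returns (total_length, num_epochs, error), where error is True when the
    accepted message estimate is wrong.
    """
    num_channels = len(beta_m)

    def estimate():
        # Training phase abstraction: wrong index with probability p_est_err.
        if rng.random() < p_est_err:
            return rng.choice([i for i in range(num_channels) if i != true_idx])
        return true_idx

    total, epochs = 0, 0
    while True:
        epochs += 1
        l_m = estimate()                        # phase 1: training
        msg_ok = rng.random() >= p_msg_err      # phase 2: message
        l_c = estimate()                        # phase 3: re-training
        if msg_ok:                              # phase 4: control (ACCEPT sent)
            accepted = rng.random() >= p_false_reject
        else:                                   # phase 4: control (REJECT sent)
            accepted = rng.random() < p_false_accept
        total += round((alpha_m + beta_m[l_m] + alpha_c + beta_c[l_c]) * n)
        if accepted:
            return total, epochs, not msg_ok

rng = random.Random(0)
print(simulate_transmission(100, 0.05, 0.1, [0.5, 0.3], [0.2, 0.1], 1,
                            0.01, 0.05, 0.001, 0.01, rng))
```

With small error probabilities almost every run ends after one epoch, and an undetected error requires both a message error and a reject decoded as accept.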

3.3 Choice of training sequences 

As described earlier, the transmitter and the receiver agree upon two training sequences, $\sigma_m$ and
$\sigma_c$, of lengths $\alpha_m n$ and $\alpha_c n$, respectively. The optimal choice of such training sequences falls under
the domain of experiment design for estimating unknown parameters. We assume that we can find
good training sequences for $\mathcal{Q}$; if not, we choose a simple training sequence that cycles through all
the channel inputs one-by-one.

The transmitter and the receiver also agree upon two estimation rules, $\theta_m$ and $\theta_c$. For a training
sequence $\sigma$ of size $n$ and an estimation rule $\theta$, define the estimation error exponent as

$$T_{\ell,k} = \lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_\ell(\theta(Y^n) = k \mid X^n = \sigma), \quad k, \ell = 1, \dots, L \tag{8}$$

and for $\ell = 1, \dots, L$,

$$T_\ell = \lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_\ell(\theta(Y^n) \neq \ell \mid X^n = \sigma) = \min\{T_{\ell,k} : k = 1, \dots, L,\ k \neq \ell\} \tag{9}$$

where $X^n$ and $Y^n$ are the channel inputs and outputs respectively. We are interested in
characterizing the union of $(T_1, \dots, T_L)$ for all choices of estimation rule $\theta$. We call this region the estimation
error exponent region and denote it by $\mathcal{T}$. Instead of directly characterizing the estimation error
exponent region, it is easier to first characterize the pairwise estimation error exponent region, that is, the union of
$(T_{\ell,k};\ \ell, k = 1, \dots, L;\ k \neq \ell)$ for all choices of estimation rule $\theta$, which is denoted by $\mathcal{T}^*$, and
then obtain the estimation error exponent region $\mathcal{T}$ using (9).

Characterizing the pairwise estimation error exponent is equivalent to characterizing the
pairwise hypothesis testing exponent for multiple hypothesis testing. The latter was characterized by
Tuncel [13] for $L$-ary hypothesis testing with independent and identically distributed observations.
Let $p_\ell$ be the probability distribution of the observations under hypothesis $\ell$. Then,

$$\mathcal{T}^* = \{(T_{\ell,k},\ \ell \neq k) : \forall p \in \Delta(\mathcal{Y}),\ \exists k \text{ such that } D(p \,\|\, p_\ell) > T_{\ell,k} \text{ for all } \ell \neq k\}$$

For our setup, the observations at the receiver need not be identically distributed. Nonetheless,
the observations are independent across time, and it is easy to generalize the above region to the
case of independent (but not identically distributed) observations. We then use $\mathcal{T}^*$ to obtain the
desired region $\mathcal{T}$ as follows:

$$\mathcal{T} = \{(T_1, \dots, T_L) : \exists (T_{\ell,k},\ \ell \neq k) \in \mathcal{T}^* \text{ such that } \forall \ell,\ T_\ell = \min_{k \neq \ell} T_{\ell,k}\}$$

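The estimation rules $\theta_m$ and $\theta_c$ attain particular points in $\mathcal{T}$; denote these by $(T_{m,1}, \dots, T_{m,L})$
and $(T_{c,1}, \dots, T_{c,L})$, respectively. Recall that the training sequences $\sigma_m$ and $\sigma_c$ are of length $\alpha_m n$
and $\alpha_c n$ respectively. Thus, for any epoch $k$,

$$\lim_{n \to \infty} -\frac{1}{\alpha_m n} \log \mathbb{P}_\ell(\hat{L}_m \neq \ell) = T_{m,\ell}, \quad \ell = 1, \dots, L; \tag{10}$$

and

$$\lim_{n \to \infty} -\frac{1}{\alpha_c n} \log \mathbb{P}_\ell(\hat{L}_c \neq \ell) = T_{c,\ell}, \quad \ell = 1, \dots, L. \tag{11}$$

Choose $\theta_m$ and $\theta_c$ such that

$$\lim_{n \to \infty} \mathbb{P}_\ell(\hat{L}_m \neq \ell) = 0, \quad \lim_{n \to \infty} \mathbb{P}_\ell(\hat{L}_c \neq \ell) = 0; \tag{12}$$

and

$$T_{m,\ell} > 0, \quad T_{c,\ell} > 0, \quad \ell = 1, \dots, L. \tag{13}$$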
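One concrete estimation rule, offered here only as an illustrative assumption since the text leaves the rule $\theta$ generic, is maximum-likelihood identification of the channel from a training sequence that cycles through the inputs. The sketch represents each channel as a row-stochastic matrix `Q[x][y]`.

```python
import math

def cycling_training_sequence(num_inputs, length):
    """Training sequence that cycles through all channel inputs one by one."""
    return [t % num_inputs for t in range(length)]

def ml_channel_estimate(family, inputs, outputs):
    """Index of the channel in `family` that maximizes the likelihood of the outputs."""
    def log_likelihood(Q):
        total = 0.0
        for x, y in zip(inputs, outputs):
            if Q[x][y] == 0:
                return -math.inf
            total += math.log(Q[x][y])
        return total

    scores = [log_likelihood(Q) for Q in family]
    return max(range(len(family)), key=scores.__getitem__)

# Two BSCs with complementary crossover probabilities 0.1 and 0.9.
family = [[[0.9, 0.1], [0.1, 0.9]], [[0.1, 0.9], [0.9, 0.1]]]
xs = cycling_training_sequence(2, 6)
print(ml_channel_estimate(family, xs, xs))                    # outputs equal inputs
print(ml_channel_estimate(family, xs, [1 - x for x in xs]))   # outputs flipped
```

Because both the transmitter and the receiver see the channel outputs (through feedback), both can run the same rule and arrive at the same estimate.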



3.4 Choice of codebooks 

As described earlier, the transmitter and receiver agree upon $L$ codebooks. Codebook $\ell$ is a
fixed length codebook for DMC $Q_\ell$, $\ell = 1, \dots, L$, with rate $\xi_\ell R_\ell / \beta_{m,\ell}$ and length $\beta_{m,\ell} n$. Choose
codebook $\ell$ such that the error exponent is positive for all rates below capacity, i.e.,

$$\text{if } \frac{\xi_\ell R_\ell}{\beta_{m,\ell}} < C_\ell, \text{ then } \lim_{n \to \infty} -\frac{1}{\beta_{m,\ell} n} \log \mathbb{P}_\ell(\hat{W}_m(k) \neq W_m(k)) > 0. \tag{14}$$

The actual form of the codebook does not matter; for example, it could be a linear code, or a
convolutional code, or an LDPC code, or a polar code, or a posterior matching code that uses
feedback.
3.5 Choice of control sequences

As described earlier, the transmitter and the receiver agree upon two control sequences, $\sigma_{A,\ell}$ and
$\sigma_{R,\ell}$, of length $\beta_{c,\ell} n$, for signaling accept (when $\hat{W}_m = W_m$) and reject (when $\hat{W}_m \neq W_m$). Choose
these sequences as repetitions of $x_{A,\ell}$ and $x_{R,\ell}$, the maximally separated input symbols for $Q_\ell$, i.e.,
the arg max in (5) for $B_{Q_\ell}$.

The transmitter and the receiver also agree upon a hypothesis testing rule $\theta_{H,\ell}$ for disambiguating
$\sigma_{A,\ell}$ and $\sigma_{R,\ell}$. Let $H_{A,\ell}$ and $H_{R,\ell}$ denote the error exponents of this rule, that is,

$$H_{A,\ell} = \lim_{n \to \infty} -\frac{1}{\beta_{c,\ell} n} \log \mathbb{P}_\ell(\hat{W}_c(k) \neq W_c(k) \mid W_c(k) = \sigma_{A,\ell}); \tag{15}$$

and

$$H_{R,\ell} = \lim_{n \to \infty} -\frac{1}{\beta_{c,\ell} n} \log \mathbb{P}_\ell(\hat{W}_c(k) \neq W_c(k) \mid W_c(k) = \sigma_{R,\ell}). \tag{16}$$

Choose $\theta_{H,\ell}$ such that

$$H_{A,\ell} = 0 \quad \text{and} \quad H_{R,\ell} = B_\ell; \tag{17}$$

while

$$\lim_{n \to \infty} \mathbb{P}_\ell(\hat{W}_c(k) \neq W_c(k) \mid W_c(k) = \sigma_{A,\ell}) = \lim_{n \to \infty} \mathbb{P}_\ell(\hat{W}_c(k) \neq W_c(k) \mid W_c(k) = \sigma_{R,\ell}) = 0. \tag{18}$$

Such a choice of $\theta_{H,\ell}$ is always possible (see [14]).



3.6 Choice of parameters 

The first and second phases of the proposed scheme correspond to the message mode of the
Yamamoto-Itoh [12] scheme, while the third and fourth phases correspond to the control mode. In the
Yamamoto-Itoh scheme, the ratio of the lengths of the message and control modes is $\gamma/(1 - \gamma)$
where $\gamma = R/C$. We choose the parameters such that a similar relation holds for the proposed
scheme. In particular, let $\gamma_\ell = R_\ell/C_\ell$; then, we want

$$\lim_{n \to \infty} \frac{\alpha_m + \beta_{m,\ell}}{\alpha_c + \beta_{c,\ell}} = \frac{\gamma_\ell}{1 - \gamma_\ell}.$$

The parameter $\xi_\ell$ is the proportionality constant, that is,

$$\lim_{n \to \infty} \alpha_m + \beta_{m,\ell} = \xi_\ell \gamma_\ell \quad \text{and} \quad \lim_{n \to \infty} \alpha_c + \beta_{c,\ell} = \xi_\ell (1 - \gamma_\ell).$$

We set one of these proportionality constants to one and call that channel the reference channel
$Q_*$.

In the Burnashev exponent, the slope (i.e., the $B_Q$ term in (4)) is determined by the "signaling
exponent" in the control mode. As will become apparent in the proof of Proposition 6, to maximize
the slope of our exponent, we need to choose the parameters such that

$$\lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_\ell(\hat{W}_c(1) \neq W_c(1) \mid W_c(1) = \sigma_{R,\ell},\ \hat{L}_c(1) = \ell) = \lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_\ell(\hat{L}_c(1) \neq \ell).$$
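We choose the parameters that satisfy the above properties as follows. For $\ell = 1, \dots, L$, define
constants

$$\kappa_\ell = \frac{T_{c,\ell}}{B_\ell}, \quad \gamma_\ell = \frac{R_\ell}{C_\ell}, \quad \zeta_\ell = \frac{1 - \gamma_\ell}{1 + \kappa_\ell}. \tag{19}$$

Let $\kappa_*$, $\gamma_*$ and $\zeta_*$ be the $\kappa$, $\gamma$ and $\zeta$ parameters corresponding to the reference channel $Q_*$. Then
choose the parameters of the coding scheme as follows:

1. Choose $\xi_\ell = \zeta_*/\zeta_\ell$.

2. Choose $\alpha_m^{(n)} > 0$ such that $\alpha_m^{(n)} n$ is an integer, $\lim_{n \to \infty} \alpha_m^{(n)} = 0$ while $\lim_{n \to \infty} \alpha_m^{(n)} n = \infty$.
An example for such a choice is $\alpha_m^{(n)} = \lfloor \sqrt{\log n} \rfloor / n$.

3. Choose $\beta_{m,\ell}^{(n)} > \xi_\ell \gamma_\ell$ such that $\beta_{m,\ell}^{(n)} n$ is an integer, $\lim_{n \to \infty} \beta_{m,\ell}^{(n)} = \xi_\ell \gamma_\ell$.

4. Choose $\alpha_c^{(n)} > 0$ such that $\alpha_c^{(n)} n$ is an integer, $\lim_{n \to \infty} \alpha_c^{(n)} = \zeta_*$.

5. Choose $\beta_{c,\ell}^{(n)} > 0$ such that $\beta_{c,\ell}^{(n)} n$ is an integer, $\lim_{n \to \infty} \beta_{c,\ell}^{(n)} = \kappa_\ell \zeta_*$.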
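The limiting values of the parameters can be computed directly. The sketch below assumes the reading $\kappa_\ell = T_{c,\ell}/B_\ell$, $\gamma_\ell = R_\ell/C_\ell$, $\zeta_\ell = (1-\gamma_\ell)/(1+\kappa_\ell)$ and $\xi_\ell = \zeta_*/\zeta_\ell$ of the constants, with limits $\alpha_m \to 0$, $\beta_{m,\ell} \to \xi_\ell\gamma_\ell$, $\alpha_c \to \zeta_*$ and $\beta_{c,\ell} \to \kappa_\ell\zeta_*$; this reading is inferred from the relations the text requires and should be treated as an assumption. The function and key names are hypothetical.

```python
def scheme_parameters(rates, capacities, B, T_c, ref):
    """Limiting (n -> infinity) phase-length parameters for reference channel index `ref`."""
    gamma = [r / c for r, c in zip(rates, capacities)]
    kappa = [t / b for t, b in zip(T_c, B)]
    zeta = [(1 - g) / (1 + k) for g, k in zip(gamma, kappa)]
    xi = [zeta[ref] / z for z in zeta]
    return {
        "gamma": gamma,
        "kappa": kappa,
        "zeta": zeta,
        "xi": xi,
        "alpha_m": 0.0,                                  # training overhead vanishes
        "beta_m": [x * g for x, g in zip(xi, gamma)],    # message phase
        "alpha_c": zeta[ref],                            # re-training phase
        "beta_c": [k * zeta[ref] for k in kappa],        # control phase
    }

params = scheme_parameters(rates=[0.25, 0.1], capacities=[0.5, 0.25],
                           B=[2.0, 1.0], T_c=[0.5, 0.5], ref=0)
print(params)
```

Under this reading, the control-mode length satisfies $\alpha_c + \beta_{c,\ell} = \xi_\ell(1-\gamma_\ell)$, and the two exponents being balanced, $\beta_{c,\ell} B_\ell$ and $\alpha_c T_{c,\ell}$, are equal for every $\ell$.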

3.7 Consequences of the choice of parameters 

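The choice of the parameters $\alpha_m$, $\alpha_c$, $\beta_{m,\ell}$, $\beta_{c,\ell}$, and $\xi_\ell$, $\ell = 1, \dots, L$, implies the following:

Lemma 1 (Length of message and control phases) For every $\ell = 1, \dots, L$, we have that

$$\lim_{n \to \infty} \alpha_m + \beta_{m,\ell} = \xi_\ell \gamma_\ell \quad \text{and} \quad \lim_{n \to \infty} \alpha_c + \beta_{c,\ell} = \xi_\ell (1 - \gamma_\ell). \qquad □$$

The choice of the estimation rules $\theta_m$, $\theta_c$, the codebooks, and the hypothesis testing rules $\theta_{H,\ell}$,
$\ell = 1, \dots, L$, implies the following properties:

Lemma 2 For every $k \in \mathbb{N}$ and $\ell = 1, \dots, L$, we have that

$$\lim_{n \to \infty} -\frac{1}{\alpha_m n} \log \mathbb{P}_\ell(\hat{L}_m(k) \neq \ell \mid K \geq k) = T_{m,\ell};$$

$$\lim_{n \to \infty} -\frac{1}{\alpha_c n} \log \mathbb{P}_\ell(\hat{L}_c(k) \neq \ell \mid K \geq k) = T_{c,\ell};$$

$$\lim_{n \to \infty} -\frac{1}{\beta_{m,\ell} n} \log \mathbb{P}_\ell(\hat{W}_m(k) \neq W_m(k) \mid \hat{L}_m(k) = \ell,\ K \geq k) > 0;$$

$$\lim_{n \to \infty} -\frac{1}{\beta_{c,\ell} n} \log \mathbb{P}_\ell(\hat{W}_c(k) \neq W_c(k) \mid \hat{L}_c(k) = \ell,\ W_c(k) = \sigma_{A,\ell},\ K \geq k) = H_{A,\ell} = 0;$$

$$\lim_{n \to \infty} -\frac{1}{\beta_{c,\ell} n} \log \mathbb{P}_\ell(\hat{W}_c(k) \neq W_c(k) \mid \hat{L}_c(k) = \ell,\ W_c(k) = \sigma_{R,\ell},\ K \geq k) = H_{R,\ell} = B_\ell. \qquad □$$

An immediate consequence of the above is that each of the error probabilities approaches zero as
$n \to \infty$. Specifically,

Lemma 3 For every $k \in \mathbb{N}$ and $\ell = 1, \dots, L$, we have that

$$\lim_{n \to \infty} \mathbb{P}_\ell(\hat{L}_m(k) \neq \ell \mid K \geq k) = 0;$$

$$\lim_{n \to \infty} \mathbb{P}_\ell(\hat{L}_c(k) \neq \ell \mid K \geq k) = 0;$$

$$\lim_{n \to \infty} \mathbb{P}_\ell(\hat{W}_m(k) \neq W_m(k) \mid \hat{L}_m(k) = \ell,\ K \geq k) = 0;$$

$$\lim_{n \to \infty} \mathbb{P}_\ell(\hat{W}_c(k) \neq W_c(k) \mid \hat{L}_c(k) = \ell,\ W_c(k) = \sigma_{A,\ell},\ K \geq k) = 0;$$

$$\lim_{n \to \infty} \mathbb{P}_\ell(\hat{W}_c(k) \neq W_c(k) \mid \hat{L}_c(k) = \ell,\ W_c(k) = \sigma_{R,\ell},\ K \geq k) = 0. \qquad □$$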
4 Performance analysis
4.1 Some preliminary results

Recall that the length of epoch $k \in \mathbb{N}$ is $\Lambda(k) n$. Thus,

$$\Lambda(k) = \alpha_m + \beta_{m,\hat{L}_m(k)} + \alpha_c + \beta_{c,\hat{L}_c(k)}.$$

Combining Lemmas 1 and 3, we get the following:

Lemma 4 For every $k \in \mathbb{N}$ and $\ell = 1, \dots, L$, we have that

$$\lim_{n \to \infty} \mathbb{E}_\ell[\Lambda(k)] = \xi_\ell. \tag{20}$$

Thus, for large $n$ and realized channel $Q_\ell$, the expected length of each epoch is $\xi_\ell n$. □
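Let

$$\Sigma_A = \{\sigma_{A,\ell},\ \ell = 1, \dots, L\} \quad \text{and} \quad \Sigma_R = \{\sigma_{R,\ell},\ \ell = 1, \dots, L\}$$

denote the sets of all accept and all reject control signals, respectively, and let $p_\ell^{(n)}$ denote the
probability that the estimated control sequence in epoch $k$ is in $\Sigma_A$, i.e.,

$$p_\ell = \mathbb{P}_\ell(\hat{W}_c(k) \in \Sigma_A \mid K \geq k). \tag{21}$$

Due to symmetry across epochs, $p_\ell$ does not depend on $k$.

Conditioned on the event that $K \geq k$, communication stops at epoch $k$ if the estimated control
sequence $\hat{W}_c(k)$ is ACCEPT. Hence,

$$\mathbb{P}_\ell(K = k \mid K \geq k) = p_\ell.$$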

Consequently, we have the following: 

Proposition 2 For any $n \in \mathbb{N}$ and $\ell = 1, \dots, L$, the number of transmission epochs has a geometric
distribution; in particular,

$$\mathbb{P}_\ell(K = k) = p_\ell (1 - p_\ell)^{k-1}, \quad k \in \mathbb{N}. \tag{22}$$

Furthermore, Lemma 3 implies that

$$\lim_{n \to \infty} p_\ell^{(n)} = 1. \tag{23}$$

Hence,

$$\lim_{n \to \infty} \mathbb{P}_\ell(K^{(n)} = 1) = 1. \tag{24}$$

Thus, for large $n$ and irrespective of the realized channel, the expected number of transmission
epochs is one. □

4.2 Expected length of communication 

Proposition 3 For every $\ell = 1, \dots, L$,

$$\lim_{n \to \infty} \frac{1}{n} \mathbb{E}_\ell[\tau^{(n)}] = \xi_\ell. \tag{25}$$

□
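PROOF Since $\tau = \sum_{k=1}^{K} \Lambda(k) n$, we get

$$\frac{1}{n} \mathbb{E}_\ell[\tau] = \mathbb{E}_\ell\left[\sum_{k=1}^{K} \Lambda(k)\right] = \mathbb{P}_\ell(K = 1)\, \mathbb{E}_\ell[\Lambda(1)] + \mathbb{P}_\ell(K > 1)\, \mathbb{E}_\ell\left[\sum_{k=1}^{K} \Lambda(k) \,\middle|\, K > 1\right].$$

Proposition 2 implies that

$$\lim_{n \to \infty} \frac{1}{n} \mathbb{E}_\ell[\tau] = \lim_{n \to \infty} \mathbb{E}_\ell[\Lambda(1)] = \xi_\ell,$$

where the last equality follows from Lemma 4. ■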
4.3 Probability of error 

Proposition 4 For any $n \in \mathbb{N}$ and $\ell = 1, \dots, L$, the probability of error is given by

$$P_\ell^{(n)} = \frac{1}{p_\ell}\, \mathbb{P}_\ell(\hat{W}_m(1) \neq W_m(1))\, \mathbb{P}_\ell(\hat{W}_c(1) \neq W_c(1) \mid W_c(1) \in \Sigma_R). \tag{26}$$

□



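PROOF The error event is $\{\hat{W}_m(K) \neq W_{\hat{L}_m(K)}\}$. For each $k \in \mathbb{N}$, $W_m(k) = W_{\hat{L}_m(k)}$. Using this to
simplify the probability of error, we get that

$$P_\ell^{(n)} = \mathbb{P}_\ell(\hat{W}_m(K) \neq W_m(K))$$

$$= \sum_{k=1}^{\infty} \mathbb{P}_\ell(\hat{W}_m(k) \neq W_m(k) \mid K = k)\, \mathbb{P}_\ell(K = k)$$

$$\overset{(a)}{=} \mathbb{P}_\ell(\hat{W}_m(1) \neq W_m(1) \mid K = 1) \sum_{k=1}^{\infty} \mathbb{P}_\ell(K = k)$$

$$\overset{(b)}{=} \frac{1}{\mathbb{P}_\ell(K = 1)}\, \mathbb{P}_\ell(\hat{W}_m(1) \neq W_m(1))\, \mathbb{P}_\ell(K = 1 \mid \hat{W}_m(1) \neq W_m(1))$$

$$= \frac{1}{p_\ell}\, \mathbb{P}_\ell(\hat{W}_m(1) \neq W_m(1))\, \mathbb{P}_\ell(\hat{W}_c(1) \neq W_c(1) \mid W_c(1) \in \Sigma_R),$$

where (a) follows from the symmetry across epochs and (b) follows from Bayes rule. ■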
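The product form of the error probability and the geometric law of the number of epochs can be checked numerically. The sketch below takes $p$ to be the per-epoch stopping probability and models one epoch with three hypothetical numbers: the message error probability, the probability that a reject is decoded as accept, and the probability that an accept is decoded as reject. It is an illustration under these simplifying assumptions, not a computation from the paper.

```python
def stop_probability(p_msg_err, p_undetected, p_false_reject):
    """Per-epoch probability that the receiver decodes ACCEPT."""
    return (1 - p_msg_err) * (1 - p_false_reject) + p_msg_err * p_undetected

def geometric_pmf(p_stop, k):
    """P(K = k) when each epoch stops independently with probability p_stop."""
    return p_stop * (1 - p_stop) ** (k - 1)

def expected_epochs(p_stop):
    """Mean of the geometric distribution on {1, 2, ...}."""
    return 1.0 / p_stop

def error_probability(p_stop, p_msg_err, p_undetected):
    """(1 / p) * P(message error) * P(reject decoded as accept)."""
    return p_msg_err * p_undetected / p_stop

p = stop_probability(p_msg_err=0.1, p_undetected=0.01, p_false_reject=0.05)
print(p, expected_epochs(p), error_probability(p, 0.1, 0.01))
```

As the per-epoch error probabilities shrink, the stopping probability tends to one, so the expected number of epochs tends to one and the error probability is governed by the product of the message error and the undetected-reject probabilities.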
4.4 Opportunistically achievable rate 

Proposition 5 The coding scheme of Section^opportunistically achieves the rate vector (R\, . . . , Rl)-o 

Proof To prove the result, we need to show that the proposed scheme satisfies the properties described
in Definition 4. Specifically,

\[ \lim_{n \to \infty} \mathbb{E}_\ell[\tau] = \infty; \tag{27} \]

along with

\[ \lim_{n \to \infty} \frac{\mathbb{E}_\ell[\log M_{\hat L_m(K)}]}{\mathbb{E}_\ell[\tau]} = R_\ell; \tag{28} \]

and

\[ \lim_{n \to \infty} P_\ell^{(n)} = 0. \tag{29} \]



We prove these separately. 

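(a) Property (27) follows from Proposition 3.

(b) Recall that $M_\ell = \exp_2(n \xi_\ell R_\ell)$. Hence,

\[ \frac{\mathbb{E}_\ell[\log M_{\hat L_m(K)}]}{\mathbb{E}_\ell[\tau]} = \frac{n\, \mathbb{E}_\ell[\xi_{\hat L_m(K)} R_{\hat L_m(K)}]}{\mathbb{E}_\ell[\tau]}. \]

Proposition 3 implies that

\[ \lim_{n \to \infty} \frac{\mathbb{E}_\ell[\log M_{\hat L_m(K)}]}{\mathbb{E}_\ell[\tau]} = \frac{1}{\xi_\ell} \lim_{n \to \infty} \mathbb{E}_\ell[\xi_{\hat L_m(K)} R_{\hat L_m(K)}]. \tag{30} \]

Now,

\[ \mathbb{E}_\ell[\xi_{\hat L_m(K)} R_{\hat L_m(K)}] = \mathbb{P}_\ell(K = 1)\, \mathbb{E}_\ell[\xi_{\hat L_m(1)} R_{\hat L_m(1)}] + \mathbb{P}_\ell(K > 1)\, \mathbb{E}_\ell[\xi_{\hat L_m(K)} R_{\hat L_m(K)} \mid K > 1]. \]

Using Proposition 2, we get that

\[ \lim_{n \to \infty} \mathbb{E}_\ell[\xi_{\hat L_m(K)} R_{\hat L_m(K)}] = \lim_{n \to \infty} \mathbb{E}_\ell[\xi_{\hat L_m(1)} R_{\hat L_m(1)}]. \tag{31} \]

Now,

\[ \mathbb{E}_\ell[\xi_{\hat L_m(1)} R_{\hat L_m(1)}] = \mathbb{P}_\ell(\hat L_m(1) = \ell)\, \mathbb{E}_\ell[\xi_{\hat L_m(1)} R_{\hat L_m(1)} \mid \hat L_m(1) = \ell] + \mathbb{P}_\ell(\hat L_m(1) \neq \ell)\, \mathbb{E}_\ell[\xi_{\hat L_m(1)} R_{\hat L_m(1)} \mid \hat L_m(1) \neq \ell]. \]

Using Lemma 3, we get that

\[ \lim_{n \to \infty} \mathbb{E}_\ell[\xi_{\hat L_m(1)} R_{\hat L_m(1)}] = \xi_\ell R_\ell. \tag{32} \]

Substituting (31) and (32) in (30) gives (28).

(c) Property (29) follows by substituting the results of Proposition 2 and Lemma 3 in Proposition 4. ■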



4.5 Error exponent region 

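Proposition 6 For a particular choice of estimation rule $\theta_c$, the $\ell$-th component of the error exponent
$(E_1, \dots, E_L)$ of the coding scheme of Section 3 is bounded by

\[ E_\ell \ge \frac{\kappa_\ell}{1 + \kappa_\ell}\, B_\ell (1 - \gamma_\ell) = \frac{T_{c,\ell}}{T_{c,\ell} + B_\ell}\, B_\ell (1 - \gamma_\ell). \tag{33} \]

By varying the choice of $\theta_c$, we get

\[ \mathcal{E}(R_1, \dots, R_L) \supseteq \bigcup_{\theta_c} \Big\{ \Big( \frac{T_{c,1}}{T_{c,1} + B_1}\, B_1 \Big(1 - \frac{R_1}{C_1}\Big), \dots, \frac{T_{c,L}}{T_{c,L} + B_L}\, B_L \Big(1 - \frac{R_L}{C_L}\Big) \Big) \Big\}. \tag{34} \]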

□ 
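Proof Consider the expression for $P_\ell^{(n)}$ in Proposition 4. Taking logarithms, we get

\[ -\frac{1}{n} \log P_\ell^{(n)} = -\frac{1}{n} \log \mathbb{P}_\ell(\hat W_m(1) \neq W_m(1)) - \frac{1}{n} \log \mathbb{P}_\ell(\hat W_c(1) \neq W_c(1) \mid W_c(1) \in \Sigma_R) + \frac{1}{n} \log p_\ell. \tag{35} \]

Consider the three summands on the RHS of (35). First consider the first term of the RHS
of (35). From Lemma 3, we have that

\[ \lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_\ell(\hat W_m(1) \neq W_m(1)) \ge 0. \tag{36} \]

Next consider the second term of the RHS of (35).

\[ \mathbb{P}_\ell(\hat W_c(1) \neq W_c(1) \mid W_c(1) \in \Sigma_R) \le \mathbb{P}_\ell(\hat W_c(1) \neq W_c(1) \mid W_c(1) = \sigma_{R,\ell},\ \hat L_c(1) = \ell) + \mathbb{P}_\ell(\hat L_c(1) \neq \ell). \tag{37} \]

From Lemma 3, we have that

\[ \lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_\ell(\hat W_c(1) \neq W_c(1) \mid W_c(1) = \sigma_{R,\ell},\ \hat L_c(1) = \ell) = \kappa_\ell \zeta^* B_\ell \tag{38} \]

and

\[ \lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_\ell(\hat L_c(1) \neq \ell) = \zeta^* T_{c,\ell} = \zeta^* \kappa_\ell B_\ell, \tag{39} \]

where the last equality follows because $\kappa_\ell = T_{c,\ell}/B_\ell$. Substituting (38) and (39) in (37), and taking
logarithms and limits, we get

\[ \lim_{n \to \infty} -\frac{1}{n} \log \mathbb{P}_\ell(\hat W_c(1) \neq W_c(1) \mid W_c(1) \in \Sigma_R) \ge \zeta^* \kappa_\ell B_\ell. \tag{40} \]

Next consider the third term of the RHS of (35). From Proposition 2, it follows that

\[ \lim_{n \to \infty} \frac{1}{n} \log p_\ell = 0. \tag{41} \]

Substituting the results of (36), (40), and (41) in (35), we get

\[ \lim_{n \to \infty} -\frac{1}{n} \log P_\ell^{(n)} \ge \zeta^* \kappa_\ell B_\ell. \tag{42} \]

Combining this with Proposition 3, we get

\[ \lim_{n \to \infty} -\frac{1}{\mathbb{E}_\ell[\tau]} \log P_\ell^{(n)} \ge \zeta^* \kappa_\ell B_\ell / \xi_\ell. \tag{43} \]

The result follows by observing that $\zeta^*/\xi_\ell = (1 - \gamma_\ell)/(1 + \kappa_\ell)$ and substituting the value of $\kappa_\ell$ in (43). ■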

The choice of operating point on the EER boundary depends on the objective. For given positive
constants $w_1, \dots, w_L$, two possible objectives are to minimize the weighted probability of error

\[ P := (w_1 P_1 + \cdots + w_L P_L)/(w_1 + \cdots + w_L) \]

or maximize the weighted error exponent

\[ E := (w_1 E_1 + \cdots + w_L E_L)/(w_1 + \cdots + w_L). \]

As $n \to \infty$, each of $P_1, \dots, P_L$ decays to zero exponentially. Thus, minimizing $P$ is equivalent to
maximizing $\min\{E_1, \dots, E_L\}$. The choice of the operating point $(E_1, \dots, E_L)$, and hence the choice
of $\theta_c$, depends on the objective.
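The equivalence between minimizing the weighted error probability and maximizing the smallest exponent can be checked numerically. The sketch below assumes the idealization that each error probability equals exactly 2^(-n E_l), ignoring sub-exponential factors; the function name and the sample weights and exponents are hypothetical.

```python
import math
from typing import Sequence


def weighted_error_exponent(weights: Sequence[float],
                            exponents: Sequence[float],
                            n: int) -> float:
    """Return -(1/n) log2 of the weighted average of 2^(-n * E_l).

    Assumes the idealized model P_l = 2^(-n * E_l). The sum is evaluated
    with a log-sum-exp shift (in base 2) to avoid numerical underflow.
    """
    total_weight = sum(weights)
    log_terms = [math.log2(w / total_weight) - n * e
                 for w, e in zip(weights, exponents)]
    shift = max(log_terms)
    log2_p = shift + math.log2(sum(2.0 ** (t - shift) for t in log_terms))
    return -log2_p / n


if __name__ == "__main__":
    weights = [1.0, 3.0]      # hypothetical positive weights w_1, w_2
    exponents = [0.2, 0.5]    # hypothetical exponents E_1, E_2
    for n in (10, 100, 1000, 10000):
        print(n, round(weighted_error_exponent(weights, exponents, n), 4))
```

As n grows, the returned value approaches the smallest exponent regardless of the weights, so the component with the worst exponent dominates the weighted error probability.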
4.6 Capacity

Proposition 6 implies that for any rate vector $(R_1, \dots, R_L)$ such that $R_\ell < C_\ell$, $\ell = 1, \dots, L$, each
component of the probability of error $(P_1, \dots, P_L)$ goes to zero as $n \to \infty$. Thus,

\[ \mathcal{C}_{\mathrm{OF}}(\mathcal{Q}) \supseteq \{ (R_1, \dots, R_L) : 0 \le R_\ell < C_\ell,\ \ell = 1, \dots, L \}. \]

Furthermore, if a coding scheme (opportunistically) achieves rate $R_\ell$ when the realized channel
$Q_\circ = Q_\ell$, then the same scheme will also achieve rate $R_\ell$ when used over the DMC $Q_\ell$. Thus,

\[ \mathcal{C}_{\mathrm{OF}}(\mathcal{Q}) \subseteq \{ (R_1, \dots, R_L) : 0 \le R_\ell < C_\ell,\ \ell = 1, \dots, L \}. \]

Combining these two bounds, we get

Corollary 1 The opportunistic capacity region is given by a hyper-rectangle

\[ \mathcal{C}_{\mathrm{OF}}(\mathcal{Q}) = \{ (R_1, \dots, R_L) : 0 \le R_\ell < C_\ell,\ \ell = 1, \dots, L \}. \]

We call $\mathcal{C}_{\mathcal{Q}} := (C_1, \dots, C_L)$ the capacity vector of the compound channel $\mathcal{Q}$. □

5 An example 

Consider a compound channel consisting of two BSCs with complementary crossover probabilities,
$p$ and $(1 - p)$, where $0 < p < 1/2$ and $p$ is known to the transmitter and the receiver. Denote this
compound channel by

\[ \mathcal{Q}_p := \{ \mathrm{BSC}_p, \mathrm{BSC}_{1-p} \}, \]

where $\mathrm{BSC}_p$ denotes a binary symmetric channel with crossover probability $p$. For convenience,
we index all variables by $p$ and $(1 - p)$ rather than by 1 and 2. For a binary symmetric channel, the
capacity and the zero-rate Burnashev exponent are given by

\[ C_p = C_{1-p} = 1 - h(p) \]

and

\[ B_p = B_{1-p} = D(p \| 1 - p), \]

where $h(p) = -p \log p - (1 - p) \log(1 - p)$ is the binary entropy function and $D(p \| q) = p \log(p/q) +
(1 - p) \log((1 - p)/(1 - q))$ is the binary Kullback-Leibler divergence. Assume that the desired
communication rate is $(R_p, R_{1-p})$, where $R_p < C_p$ and $R_{1-p} < C_{1-p}$.
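The two closed-form expressions above are easy to evaluate. The following sketch computes them with base-2 logarithms (an assumption, made so that rates are in bits per channel use); the function names are illustrative and do not come from the paper.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie in (0, 1)")
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def binary_kl(p: float, q: float) -> float:
    """Binary Kullback-Leibler divergence D(p || q) in bits."""
    if not (0.0 < p < 1.0 and 0.0 < q < 1.0):
        raise ValueError("p and q must lie in (0, 1)")
    return p * math.log2(p / q) + (1.0 - p) * math.log2((1.0 - p) / (1.0 - q))


def bsc_capacity(p: float) -> float:
    """Capacity of a BSC with crossover probability p: 1 - h(p)."""
    return 1.0 - binary_entropy(p)


def bsc_zero_rate_burnashev(p: float) -> float:
    """Zero-rate Burnashev exponent of a BSC: D(p || 1 - p)."""
    return binary_kl(p, 1.0 - p)


if __name__ == "__main__":
    for p in (0.05, 0.1, 0.2, 0.4):  # sample crossover probabilities
        print(f"p = {p}: C = {bsc_capacity(p):.4f}, B = {bsc_zero_rate_burnashev(p):.4f}")
```

Both quantities are unchanged when p is replaced by 1 - p, which is why the two channels in the compound family share the same capacity and the same zero-rate Burnashev exponent.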

Choose the training sequences $a_m$ and $a_c$ as all-zero sequences of length $\alpha_m n$ and $\alpha_c n$. Choose
the channel estimation rules $\theta_m$ and $\theta_c$ as threshold tests: if the empirical frequency of ones in
the output is less than $q$, $p < q < 1 - p$, estimate the channel as $\mathrm{BSC}_p$; otherwise, estimate the
channel as $\mathrm{BSC}_{1-p}$. The thresholds for $\theta_m$ and $\theta_c$ are $q_m$ and $q_c$, respectively. For such a threshold
test, the probability of estimation error is bounded by the tail probability of a sum of independent
random variables. From Hoeffding's inequality [15, Theorem 1], the exponents of the estimation
errors are

\[ T_{m,p} = D(q_m \| p), \quad T_{m,1-p} = D(q_m \| 1 - p), \quad T_{c,p} = D(q_c \| p), \quad T_{c,1-p} = D(q_c \| 1 - p). \]
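The dependence of the estimation exponents on the threshold can be sketched as follows. The code assumes base-2 logarithms and uses hypothetical function names; it evaluates only the divergence expressions stated above and does not simulate the threshold test itself.

```python
import math
from typing import Tuple


def binary_kl(a: float, b: float) -> float:
    """Binary Kullback-Leibler divergence D(a || b) in bits."""
    if not (0.0 < a < 1.0 and 0.0 < b < 1.0):
        raise ValueError("arguments must lie in (0, 1)")
    return a * math.log2(a / b) + (1.0 - a) * math.log2((1.0 - a) / (1.0 - b))


def estimation_exponents(p: float, q: float) -> Tuple[float, float]:
    """Exponents (D(q || p), D(q || 1 - p)) of a threshold test with threshold q.

    The first entry applies when the realized channel is BSC_p, the second
    when it is BSC_{1-p}. Requires p < q < 1 - p.
    """
    if not 0.0 < p < q < 1.0 - p:
        raise ValueError("need 0 < p < q < 1 - p")
    return binary_kl(q, p), binary_kl(q, 1.0 - p)


if __name__ == "__main__":
    p = 0.1  # sample crossover probability
    for q in (0.2, 0.3, 0.5, 0.7, 0.8):
        t_p, t_1mp = estimation_exponents(p, q)
        print(f"q = {q}: T_p = {t_p:.4f}, T_(1-p) = {t_1mp:.4f}")
```

Moving the threshold toward p lowers the exponent for BSC_p and raises the one for BSC_{1-p}; the two are equal at q = 1/2. This trade-off is what the choice of the threshold controls.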

Choose the two codebooks as any codebooks for $\mathrm{BSC}_p$ and $\mathrm{BSC}_{1-p}$ that have positive error
exponents.
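Choose the control sequences $\sigma_{A,p}$ and $\sigma_{R,p}$ as $\beta_{c,p} n$ repetitions of zeros and ones, respectively.
Similarly, choose the control sequences $\sigma_{A,1-p}$ and $\sigma_{R,1-p}$ as $\beta_{c,1-p} n$ repetitions of ones and zeros,
respectively. The hypothesis testing rules $\theta_{h,p}$ and $\theta_{h,1-p}$ are chosen as described in Section 3.5.

[Figure 1: plot of the scaled EER region; the horizontal axis is the scaled exponent for $\mathrm{BSC}_p$ and the vertical axis is the scaled exponent for $\mathrm{BSC}_{1-p}$.]

Figure 1: The scaled EER region $\Phi(p) = \{ (E_p/(B_p(1-\gamma_p)),\, E_{1-p}/(B_{1-p}(1-\gamma_{1-p}))) : (E_p, E_{1-p}) \in \mathcal{E}(R_p, R_{1-p}) \}$ for
different instances of the compound channel $\mathcal{Q}_p := \{ \mathrm{BSC}_p, \mathrm{BSC}_{1-p} \}$.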

Proposition 6 implies that for any rate vector $(R_p, R_{1-p})$ and a particular choice of the estimation
threshold $q_c$, the above scheme achieves an error exponent $(E_p, E_{1-p})$ such that

\[ E_p \ge \frac{D(q_c \| p)\, D(p \| 1 - p)}{D(q_c \| p) + D(p \| 1 - p)}\, (1 - \gamma_p), \]

\[ E_{1-p} \ge \frac{D(q_c \| 1 - p)\, D(p \| 1 - p)}{D(q_c \| 1 - p) + D(p \| 1 - p)}\, (1 - \gamma_{1-p}), \]

where $\gamma_p = R_p/C_p$ and $\gamma_{1-p} = R_{1-p}/C_{1-p}$.
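The two lower bounds can be evaluated for concrete parameters. The sketch below assumes base-2 logarithms; the function names and the sample values of p, the threshold, and the rates are hypothetical and chosen only for illustration.

```python
import math
from typing import Tuple


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits."""
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def binary_kl(a: float, b: float) -> float:
    """Binary Kullback-Leibler divergence D(a || b) in bits."""
    return a * math.log2(a / b) + (1.0 - a) * math.log2((1.0 - a) / (1.0 - b))


def bsc_capacity(p: float) -> float:
    """Capacity 1 - h(p), shared by BSC_p and BSC_{1-p}."""
    return 1.0 - binary_entropy(p)


def exponent_lower_bounds(p: float, q_c: float,
                          rate_p: float, rate_1mp: float) -> Tuple[float, float]:
    """Lower bounds on (E_p, E_{1-p}) for threshold q_c and rates (R_p, R_{1-p})."""
    if not 0.0 < p < q_c < 1.0 - p:
        raise ValueError("need 0 < p < q_c < 1 - p")
    capacity = bsc_capacity(p)
    if not (0.0 <= rate_p < capacity and 0.0 <= rate_1mp < capacity):
        raise ValueError("rates must lie in [0, C)")
    burnashev = binary_kl(p, 1.0 - p)      # B_p = B_{1-p}
    t_p = binary_kl(q_c, p)                # T_{c,p}
    t_1mp = binary_kl(q_c, 1.0 - p)        # T_{c,1-p}
    gamma_p = rate_p / capacity
    gamma_1mp = rate_1mp / capacity
    bound_p = t_p * burnashev / (t_p + burnashev) * (1.0 - gamma_p)
    bound_1mp = t_1mp * burnashev / (t_1mp + burnashev) * (1.0 - gamma_1mp)
    return bound_p, bound_1mp


if __name__ == "__main__":
    p = 0.1
    half_capacity = 0.5 * bsc_capacity(p)
    for q_c in (0.2, 0.5, 0.8):
        print(q_c, exponent_lower_bounds(p, q_c, half_capacity, half_capacity))
```

Each bound is linear in the corresponding rate and vanishes as the rate approaches capacity, while the threshold trades the exponent of one channel against that of the other.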

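There are no known upper bounds on the EER. Hence, we compare with the trivial upper bound
given by the Burnashev exponents of $\mathrm{BSC}_p$ and $\mathrm{BSC}_{1-p}$, which at rates $R_p$ and $R_{1-p}$ equal
$B_p(1 - \gamma_p)$ and $B_{1-p}(1 - \gamma_{1-p})$:

\[ \frac{E_p}{B_p (1 - \gamma_p)} \ge \frac{D(q_c \| p)}{D(q_c \| p) + D(p \| 1 - p)}, \qquad
   \frac{E_{1-p}}{B_{1-p} (1 - \gamma_{1-p})} \ge \frac{D(q_c \| 1 - p)}{D(q_c \| 1 - p) + D(p \| 1 - p)}. \]

Let

\[ \varphi(p, q_c) := \Big( \frac{D(q_c \| p)}{D(q_c \| p) + D(p \| 1 - p)},\ \frac{D(q_c \| 1 - p)}{D(q_c \| 1 - p) + D(p \| 1 - p)} \Big) \]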



The choice of $q_m$ does not affect the values of $E_p$ and $E_{1-p}$ as long as $\mathbb{P}_\ell(\hat L_m \neq \ell) \to 0$. For that, we require only
that $p < q_m < 1 - p$. Choosing $q_m = 0.5$ ensures that.
and

\[ \Phi(p) := \Big\{ \Big( \frac{E_p}{B_p (1 - \gamma_p)},\ \frac{E_{1-p}}{B_{1-p} (1 - \gamma_{1-p})} \Big) : (E_p, E_{1-p}) \in \mathcal{E}(R_p, R_{1-p}) \Big\}
 = \{ \varphi(p, q_c) : p < q_c < 1 - p \}. \]

For the scheme proposed in Section 3, $\Phi(p)$ does not depend on the transmission rate $(R_p, R_{1-p})$.
We plot $\Phi(p)$ for different values of $p$ in Figure 1.
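The points of the scaled region can be generated by sweeping the threshold over the open interval between p and 1 - p. The sketch below is an assumption-laden illustration: it uses base-2 logarithms (the ratios do not depend on the base), hypothetical function names, and a uniform grid of thresholds. It produces the coordinates only and leaves plotting to the reader.

```python
import math
from typing import List, Tuple


def binary_kl(a: float, b: float) -> float:
    """Binary Kullback-Leibler divergence D(a || b) in bits."""
    return a * math.log2(a / b) + (1.0 - a) * math.log2((1.0 - a) / (1.0 - b))


def phi(p: float, q_c: float) -> Tuple[float, float]:
    """Pair of ratios T / (T + B) for BSC_p and BSC_{1-p} at threshold q_c."""
    if not 0.0 < p < q_c < 1.0 - p:
        raise ValueError("need 0 < p < q_c < 1 - p")
    burnashev = binary_kl(p, 1.0 - p)
    t_p = binary_kl(q_c, p)
    t_1mp = binary_kl(q_c, 1.0 - p)
    return t_p / (t_p + burnashev), t_1mp / (t_1mp + burnashev)


def scaled_region_points(p: float, num_points: int = 50) -> List[Tuple[float, float]]:
    """Evaluate phi(p, q_c) on a uniform grid of thresholds strictly inside (p, 1 - p)."""
    step = (1.0 - 2.0 * p) / (num_points + 1)
    return [phi(p, p + step * i) for i in range(1, num_points + 1)]


if __name__ == "__main__":
    for p in (0.05, 0.1, 0.2):  # sample channel parameters
        points = scaled_region_points(p, num_points=5)
        print(p, [(round(x, 3), round(y, 3)) for x, y in points])
```

Both coordinates stay strictly between zero and one, and increasing the threshold raises the first coordinate while lowering the second, so the sweep traces a trade-off curve between the two channels.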

6 Conclusion 

In the presence of feedback, not knowing the exact channel transition matrix does not result in 
a loss in capacity. As a result, we can provide an optimistic rate guarantee: any rate less than 
the capacity of the realized channel is opportunistically achievable, even though we do not know 
the realized channel before the start of communication. This is in contrast to the pessimistic 
rate guarantees for compound channels without feedback. More importantly, any rate vector in the
opportunistic capacity region can be achieved using a simple, training-based coding scheme. The error
exponent of this scheme has a negative slope at all rates in the capacity region, even at rates near 
the boundary of the capacity region. 

Our proposed training-based scheme is conceptually similar to Yamamoto-Itoh's scheme.
It operates in multiple epochs; each epoch is divided into a message mode and a control mode. A 
training sequence is transmitted at the beginning of each mode, and the corresponding channel 
estimate determines the operation during the remainder of the mode. 

It may appear that the proposed scheme can be simplified by combining the training phases 
in each epoch, i.e., have a training phase followed by message and control modes. However, as 
argued by Tchamkerten and Telatar in [16], such a simplification will lead to error exponents that 
have zero-slope near capacity. Our results do not contradict the results of [16] because we allow 
for more sophisticated training. Re-training in the control mode ensures that the error events
$\{\hat W_m(k) \neq W_m(k)\}$ and $\{\hat W_c(k) \neq W_c(k)\}$ are independent, which, in turn, is essential to obtain
an error exponent of the form $B_\ell(1 - \gamma_\ell)$.

One possible way to make the scheme more efficient is to accumulate the training sequences for 
each phase, i.e., the channel estimation for the message mode and the control mode is based on 
all past training sequences for that mode. Such an accumulation will improve the finite length 
performance of the scheme, but does not affect the asymptotic performance because, in the limit, 
the communication lasts for only one epoch with high probability. 

Another possibility to improve the performance of the coding scheme is to use a universal coding 
scheme for the control mode rather than a training based scheme. This motivates the study of the 
following communication problem. 

Open Problem Consider the communication of a binary valued message over a compound channel
with feedback. Let $\mathcal{Q} = \{Q_1, \dots, Q_L\}$ denote the compound channel, $W \in \{\theta_0, \theta_1\}$ denote the
message, $X_t$ and $Y_t$ denote the channel input and output at time $t$, and $\hat W$ denote the decoded
message. Consider a variable length coding scheme $(c, g, \tau)$, where $c_t$ is the encoding function at
time $t$, $g_t$ is the decoding function at time $t$, and $\tau$ is a $Y^t$-measurable stopping time. The decoded
message is

\[ \hat W = g_\tau(Y_1, \dots, Y_\tau). \]

Let $a_\ell$ and $b_\ell$ denote the exponents of the two types of errors, i.e.,

\[ a_\ell(c, g, \tau) = \frac{-\log \mathbb{P}_\ell(\hat W = \theta_1 \mid W = \theta_0)}{\mathbb{E}_\ell[\tau \mid W = \theta_0]}, \quad \ell = 1, \dots, L, \tag{44} \]

\[ b_\ell(c, g, \tau) = \frac{-\log \mathbb{P}_\ell(\hat W = \theta_0 \mid W = \theta_1)}{\mathbb{E}_\ell[\tau \mid W = \theta_1]}, \quad \ell = 1, \dots, L, \tag{45} \]

where $\mathbb{P}_\ell$ is the induced probability measure when the true channel equals $Q_\ell$.
For a sequence $S = \{(c^{(n)}, g^{(n)}, \tau^{(n)})\}_{n=1}^{\infty}$ of coding schemes such that

\[ \lim_{n \to \infty} \mathbb{E}_\ell[\tau^{(n)} \mid W = \theta_i] = \infty, \quad i = 0, 1, \quad \ell = 1, \dots, L, \]

define the type-I and type-II error exponents of $S$ as

\[ a_\ell(S) = \lim_{n \to \infty} a_\ell(c^{(n)}, g^{(n)}, \tau^{(n)}), \qquad
   b_\ell(S) = \lim_{n \to \infty} b_\ell(c^{(n)}, g^{(n)}, \tau^{(n)}). \]

Furthermore, define

\[ b_\ell^* = \max_{S :\, a_\ell(S) = 0} b_\ell(S). \]

What is the best type-II exponent $(b_1^*, \dots, b_L^*)$?

Tchamkerten and Telatar [17] studied a similar problem and identified necessary and sufficient
conditions under which

\[ b_\ell^* = B_{Q_\ell}, \quad \ell = 1, \dots, L. \]

We are not aware of the solution to the above problem when the conditions of [17] are not satisfied.

Given any sequence $S$ of coding schemes for Problem 1, we can replace the control mode (phases
three and four) of the proposed coding scheme by $S$ and achieve an error exponent of

\[ (b_1(S)(1 - \gamma_1), \dots, b_L(S)(1 - \gamma_L)). \]

If $S$ is optimal, the error exponent is

\[ (b_1^*(1 - \gamma_1), \dots, b_L^*(1 - \gamma_L)). \tag{46} \]



We conjecture that no coding scheme can achieve a better error exponent, i.e., (46) is the Pareto 
frontier of the EER. 

When the conditions of [17] are satisfied, we can replace the control mode by the variable length
coding scheme proposed in [17], and thereby recover the result of [17]. In fact, in that case, our
modified scheme is exactly the same as the variation proposed in [17, Section IV-B]. When the
conditions of [17] are not satisfied, the scheme proposed in this paper provides an inner bound on
the error exponent region. To find the best error exponents, we need to solve Problem 1.

In this paper, we presented an inner bound on the EER when the compound channel is defined
over a finite family. Generalization of the coding scheme to compound channels defined over
continuous families is an important and interesting future direction. We believe that solving Problem 1
is a critical step in that direction.